Code to estimate noise shares between ASX beta SOFIA and synthetic SOFIA¶

  • NB: We use the RBA replicated version of base SOFIA and the synthetic versions to ensure our results are not affected by the replication process
  • Other versions of SOFIA are retained for comparison but estimation is carried out on RBA replicated rates
  • Sample goes from 2022-01-04 to 2025-03-19 with four dates where SOFIA is not computed directly due to lack of transactions

Authors: James Brugler Calebe De Roure Marta Khomyn Max Prakoso Tails Putnins

In [1]:
# Load requried packages and print versions
import sys
import pandas as pd
import numpy as np
import statsmodels.api as sm
lowess = sm.nonparametric.lowess
import datetime as dt
import multiprocessing as mp
import warnings
import matplotlib.pyplot as plt
import matplotlib
from matplotlib.dates import YearLocator, MonthLocator, DateFormatter
import matplotlib.lines as mlines
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

warnings.filterwarnings('ignore', category=RuntimeWarning)

print('Python version: ', sys.version) # dt and mp are part of standard python library
print('Pandas version: ', pd.__version__)
print('Numpy version: ', np.__version__)
print('Statsmodels version: ', sm.__version__)

pathdata = 'Data/'
In [3]:
# Load the cash-rate data as published on RBA website
dfcash = pd.read_excel(pathdata + 'f01d.xlsx',
                        skiprows = 10)

dfcash.columns = [x.lower() for x in dfcash.columns]

dictcolsrba = {'series id' : 'date',
               'firmmcrtd' : 'onr',
               'firmmcrid' : 'aonia'}

dfcash = dfcash[dictcolsrba.keys()]
dfcash.columns = [dictcolsrba[x] for x in dictcolsrba.keys()]
In [4]:
# Load the ASX data
dfasx = pd.read_excel(pathdata + 'sofia-beta-version_april_2024_corected.xlsx',
                        skiprows = 4,
                        skipfooter = 10)

dfasx.columns = [x.lower() for x in dfasx.columns]

dictcolsasx = {'date' : 'date',
               'sofia overnight (%) vwap' : 'sofia_vwap_asx',
               'sofia overnight  (%) volume weighted median' : 'sofia_median_asx'}

dfasx = dfasx[dictcolsasx.keys()]
dfasx.columns = [dictcolsasx[x] for x in dictcolsasx.keys()]
In [5]:
# Make a single dataframe with both refrates 
dffin = pd.merge(dfcash,
                 dfasx,
                 on = 'date',
                 how = 'right')
In [6]:
dffin.describe()
Out[6]:
onr aonia sofia_vwap_asx sofia_median_asx
count 807.000000 807.000000 807.000000 807.000000
mean 3.252416 3.223234 3.230842 3.229641
std 1.484464 1.493535 1.515500 1.516166
min 0.100000 0.040000 0.006326 0.010000
25% 2.600000 2.560000 2.544225 2.550000
50% 4.100000 4.070000 4.092313 4.100000
75% 4.350000 4.320000 4.339667 4.340000
max 4.350000 4.340000 4.404321 4.400000
In [7]:
# Load the new RBA data
dfrba = pd.read_excel(pathdata + 'allSOFIAslonger.xlsx')
dfrba.columns = [x.lower().replace('sofia_','sofia_vwap_').replace('sofiamedian','sofia_median') for x in dfrba.columns]

rbavwapcols = [x for x in dfrba.columns if x.find('vwap')>-1]
rbamediancols = [x for x in dfrba.columns if x.find('median')>-1]

rbacols = ['date']
rbacols.extend(rbavwapcols[:])
rbacols.extend(rbamediancols)
    
dffin = pd.merge(dffin,
                 dfrba[rbacols],
                 on = 'date',
                 how = 'left')
display(dffin.describe())
onr aonia sofia_vwap_asx sofia_median_asx sofia_vwap_rba sofia_vwap_0000 sofia_vwap_0040 sofia_vwap_2525 sofia_vwap_2525_2mn sofia_vwap_0525 sofia_vwap_0525_2mn sofia_vwap_relparty sofia_median_rba sofia_median_0000 sofia_median_0040 sofia_median_2525 sofia_median_2525_2mn sofia_median_0525 sofia_median_0525_2mn sofia_median_relparty
count 807.000000 807.000000 807.000000 807.000000 803.000000 803.000000 803.000000 796.000000 796.000000 800.000000 800.000000 799.000000 803.000000 803.000000 803.000000 796.000000 796.000000 800.000000 800.000000 799.000000
mean 3.252416 3.223234 3.230842 3.229641 3.231961 3.208980 3.237651 3.208022 3.208065 3.221637 3.221657 3.228171 3.230741 3.222522 3.235405 3.209334 3.209410 3.222675 3.222713 3.227372
std 1.484464 1.493535 1.515500 1.516166 1.516501 1.521034 1.516800 1.519735 1.519744 1.518471 1.518460 1.520115 1.517160 1.517355 1.516847 1.520262 1.520306 1.519564 1.519562 1.520153
min 0.100000 0.040000 0.006326 0.010000 0.006492 -0.061389 0.014591 -0.005351 -0.004947 -0.002261 -0.001895 0.001329 0.010000 0.010000 0.010000 0.000000 0.000000 0.010000 0.010000 0.010000
25% 2.600000 2.560000 2.544225 2.550000 2.545580 2.524428 2.552440 2.523751 2.523772 2.536070 2.537901 2.544416 2.550000 2.530000 2.550000 2.530000 2.530000 2.550000 2.550000 2.550000
50% 4.100000 4.070000 4.092313 4.100000 4.092760 4.070573 4.099065 4.074365 4.074373 4.086702 4.086702 4.094379 4.100000 4.080000 4.100000 4.070000 4.075000 4.090000 4.090000 4.100000
75% 4.350000 4.320000 4.339667 4.340000 4.339666 4.325986 4.343170 4.328685 4.328689 4.334538 4.334543 4.340921 4.340000 4.330000 4.340000 4.330000 4.330000 4.340000 4.340000 4.340000
max 4.350000 4.340000 4.404321 4.400000 4.404321 4.393291 4.410237 4.384996 4.384994 4.393798 4.393807 4.406417 4.400000 4.400000 4.400000 4.390000 4.390000 4.400000 4.400000 4.400000
In [8]:
# Load older RBA-replicated SOFIA and check it all lines up
dfsofiarep = pd.read_excel(pathdata + 'sofia_replicate.xlsx')
dfsofiarep.columns = [x.lower() for x in dfsofiarep.columns]
dictcolsrba = {'sofiamedian' : 'sofia_median_rbachck', 
               'sofia' : 'sofia_vwap_rbachck'}
dfsofiarep.rename(columns = dictcolsrba, inplace = True)


dffin = pd.merge(dffin,
                 dfsofiarep[['date','sofia_vwap_rbachck','sofia_median_rbachck']],
                 on = 'date',
                 how = 'left')

plt.scatter(dffin.sofia_vwap_asx, dffin.sofia_vwap_rba)
Out[8]:
<matplotlib.collections.PathCollection at 0x27192a26370>
In [9]:
# Load older related party SOFIA file and again check it lines up 
dfsofiarelparty = pd.read_excel(pathdata + 'SOFIARelParty.xlsx')
dfsofiarelparty.columns = [x.lower() for x in dfsofiarelparty.columns]
dictcolsrba = {'sofiamedian' : 'sofia_vwap_relparty_check', 
               'sofia' : 'sofia_median_relparty_check'}

dfsofiarelparty.rename(columns = dictcolsrba, inplace = True)
dffin = pd.merge(dffin,
                 dfsofiarelparty[['date','sofia_vwap_relparty_check','sofia_median_relparty_check']],
                 on = 'date',
                 how = 'left')

plt.scatter(dffin.sofia_vwap_relparty, dffin.sofia_vwap_relparty_check)
Out[9]:
<matplotlib.collections.PathCollection at 0x271929066d0>
In [10]:
# Examine first five and last five rows
display(dffin.head())
display(dffin.tail())
date onr aonia sofia_vwap_asx sofia_median_asx sofia_vwap_rba sofia_vwap_0000 sofia_vwap_0040 sofia_vwap_2525 sofia_vwap_2525_2mn sofia_vwap_0525 sofia_vwap_0525_2mn sofia_vwap_relparty sofia_median_rba sofia_median_0000 sofia_median_0040 sofia_median_2525 sofia_median_2525_2mn sofia_median_0525 sofia_median_0525_2mn sofia_median_relparty sofia_vwap_rbachck sofia_median_rbachck sofia_vwap_relparty_check sofia_median_relparty_check
0 2022-01-04 0.1 0.04 0.015187 0.010 0.015187 -0.005466 0.016484 0.01 0.01 0.012037 0.012049 0.015187 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.015187 0.01 0.01 0.015187
1 2022-01-05 0.1 0.04 0.016911 0.020 0.016911 0.003159 0.018639 0.01 0.01 0.014611 0.014627 0.016680 0.02 0.01 0.02 0.01 0.01 0.01 0.01 0.02 0.016911 0.02 0.02 0.016680
2 2022-01-06 0.1 0.04 0.017234 0.010 0.017234 -0.020012 0.019043 0.01 0.01 0.013633 0.013633 0.015464 0.01 0.01 0.02 0.01 0.01 0.01 0.01 0.01 0.017234 0.01 0.01 0.015464
3 2022-01-07 0.1 0.04 0.017647 0.020 0.017647 -0.009158 0.019559 0.01 0.01 0.013813 0.013817 0.016611 0.02 0.01 0.02 0.01 0.01 0.01 0.01 0.01 0.017647 0.02 0.01 0.016611
4 2022-01-10 0.1 0.04 0.017760 0.015 0.017794 -0.004272 0.019743 0.01 0.01 0.015361 0.015368 0.017710 0.01 0.01 0.02 0.01 0.01 0.01 0.01 0.01 0.017794 0.01 0.01 0.017710
date onr aonia sofia_vwap_asx sofia_median_asx sofia_vwap_rba sofia_vwap_0000 sofia_vwap_0040 sofia_vwap_2525 sofia_vwap_2525_2mn sofia_vwap_0525 sofia_vwap_0525_2mn sofia_vwap_relparty sofia_median_rba sofia_median_0000 sofia_median_0040 sofia_median_2525 sofia_median_2525_2mn sofia_median_0525 sofia_median_0525_2mn sofia_median_relparty sofia_vwap_rbachck sofia_median_rbachck sofia_vwap_relparty_check sofia_median_relparty_check
802 2025-03-13 4.1 4.09 4.134282 4.13 4.134282 4.109091 4.136029 4.129060 4.129064 4.129060 4.129064 4.135223 4.13 4.13 4.14 4.13 4.13 4.13 4.13 4.13 NaN NaN NaN NaN
803 2025-03-14 4.1 4.09 4.135620 4.14 4.135595 4.120111 4.136994 4.130000 4.130000 4.130000 4.130000 4.135913 4.14 4.13 4.14 4.13 4.13 4.13 4.13 4.14 NaN NaN NaN NaN
804 2025-03-17 4.1 4.09 4.136397 4.14 4.136411 4.127051 4.138014 4.130000 4.130000 4.130000 4.130000 4.136716 4.14 4.13 4.14 4.13 4.13 4.13 4.13 4.14 NaN NaN NaN NaN
805 2025-03-18 4.1 4.09 4.135447 4.14 4.135447 4.127629 4.136809 4.130000 4.130000 4.130000 4.130000 4.135597 4.14 4.13 4.14 4.13 4.13 4.13 4.13 4.14 NaN NaN NaN NaN
806 2025-03-19 4.1 4.09 4.134332 4.13 4.134332 4.119904 4.135703 4.129619 4.129626 4.129619 4.129626 4.134758 4.13 4.13 4.13 4.13 4.13 4.13 4.13 4.13 NaN NaN NaN NaN
In [11]:
# Examine the sum-stats for the data
display(dffin.describe().T)
count mean std min 25% 50% 75% max
onr 807.0 3.252416 1.484464 0.100000 2.600000 4.100000 4.350000 4.350000
aonia 807.0 3.223234 1.493535 0.040000 2.560000 4.070000 4.320000 4.340000
sofia_vwap_asx 807.0 3.230842 1.515500 0.006326 2.544225 4.092313 4.339667 4.404321
sofia_median_asx 807.0 3.229641 1.516166 0.010000 2.550000 4.100000 4.340000 4.400000
sofia_vwap_rba 803.0 3.231961 1.516501 0.006492 2.545580 4.092760 4.339666 4.404321
sofia_vwap_0000 803.0 3.208980 1.521034 -0.061389 2.524428 4.070573 4.325986 4.393291
sofia_vwap_0040 803.0 3.237651 1.516800 0.014591 2.552440 4.099065 4.343170 4.410237
sofia_vwap_2525 796.0 3.208022 1.519735 -0.005351 2.523751 4.074365 4.328685 4.384996
sofia_vwap_2525_2mn 796.0 3.208065 1.519744 -0.004947 2.523772 4.074373 4.328689 4.384994
sofia_vwap_0525 800.0 3.221637 1.518471 -0.002261 2.536070 4.086702 4.334538 4.393798
sofia_vwap_0525_2mn 800.0 3.221657 1.518460 -0.001895 2.537901 4.086702 4.334543 4.393807
sofia_vwap_relparty 799.0 3.228171 1.520115 0.001329 2.544416 4.094379 4.340921 4.406417
sofia_median_rba 803.0 3.230741 1.517160 0.010000 2.550000 4.100000 4.340000 4.400000
sofia_median_0000 803.0 3.222522 1.517355 0.010000 2.530000 4.080000 4.330000 4.400000
sofia_median_0040 803.0 3.235405 1.516847 0.010000 2.550000 4.100000 4.340000 4.400000
sofia_median_2525 796.0 3.209334 1.520262 0.000000 2.530000 4.070000 4.330000 4.390000
sofia_median_2525_2mn 796.0 3.209410 1.520306 0.000000 2.530000 4.075000 4.330000 4.390000
sofia_median_0525 800.0 3.222675 1.519564 0.010000 2.550000 4.090000 4.340000 4.400000
sofia_median_0525_2mn 800.0 3.222713 1.519562 0.010000 2.550000 4.090000 4.340000 4.400000
sofia_median_relparty 799.0 3.227372 1.520153 0.010000 2.550000 4.100000 4.340000 4.400000
sofia_vwap_rbachck 761.0 3.175133 1.537583 0.006492 2.527913 4.089460 4.339163 4.404321
sofia_median_rbachck 761.0 3.173922 1.538279 0.010000 2.530000 4.090000 4.340000 4.400000
sofia_vwap_relparty_check 757.0 3.170026 1.541341 0.010000 2.510000 4.090000 4.340000 4.400000
sofia_median_relparty_check 757.0 3.170770 1.541259 0.001329 2.524884 4.090650 4.340249 4.406417
In [12]:
# Dictionaries for labelling plots and lists to analyze
"""
These map variable names in the dataframes to actual variable names
"""
dictlabels = {'aonia' : 'AONIA',
              'sofia_vwap_asx' : 'SOFIA (VWAP) ASX',
              'sofia_vwap_0525_2mn' : 'SOFIA (VWAP) 25-5 & trimmed',
              'sofia_vwap_0525' : 'SOFIA (VWAP) 25-5',
              'sofia_vwap_2525_2mn' : 'SOFIA (VWAP) 25-25 & trimmed',
              'sofia_vwap_2525' : 'SOFIA (VWAP) 25-25',            
              'sofia_vwap_0040' : 'SOFIA (VWAP) 0-40',            
              'sofia_vwap_0000' : 'SOFIA (VWAP) No filter',
              'sofia_vwap_relparty' : 'SOFIA (VWAP) Exclude Related Party',
              'sofia_median_asx' : 'SOFIA (Median) ASX',
              'sofia_median_0525_2mn' : 'SOFIA (median) 25-5 & trimmed',
              'sofia_median_0525' : 'SOFIA (median) 25-5',
              'sofia_median_2525_2mn' : 'SOFIA (median) 25-25 & trimmed',
              'sofia_median_2525' : 'SOFIA (median) 25-25',
              'sofia_median_0000' : 'SOFIA (Median) No filter',
              'sofia_median_0040' : 'SOFIA (Median) 0-40',            
              'sofia_median_relparty' : 'SOFIA (median) Exclude Related Party',              
              'sofia_vwap_rba' : 'SOFIA (VWAP) RBA Replication',
              'sofia_median_rba' : 'SOFIA (median) RBA Replication'
             }

sofiaalt_vwap = list(dictlabels.keys())[2:9]
sofiaalt_median = list(dictlabels.keys())[10:17]
sofia_vwaprbachck = list(dictlabels.keys())[-2:-1]
sofia_medianrbachck = list(dictlabels.keys())[-1:]
In [13]:
# Choose what we want the first rate to be for computing noise shares
ref1 = 'sofia_vwap_rba'
# ref1 = 'sofia_vwap_asx'

# Choose what we want to compare this rate to when computing noise shares  
"""
Compute the noise share for "ref1" against each of these alternatives
"""
sofiaalt = ['sofia_vwap_0525','sofia_vwap_0525_2mn',
            'sofia_vwap_2525','sofia_vwap_2525_2mn',
            'sofia_vwap_relparty']
            
# sofiaalt = sofiaalt_vwap[:]
# sofiaalt.extend(sofia_vwaprbachck)
In [14]:
# Find if we are missing a daily rate for any relevant variable
colschck = ['onr','aonia','sofia_vwap_asx','sofia_vwap_rba']
colschck.extend(sofiaalt)
xmsngdate = dffin[colschck].isnull().sum(axis = 1) >= 1
dfchck = dffin[xmsngdate].copy()
xmsng = dffin.isnull()
dfchck.fillna('-', inplace = True)

# dfchck.to_csv(pathdir + 'Dropbox/papers/Reference_Rates/RBA/Data/Output/' + 
#               'sofia_missing_dates_by_rate.csv',
#               index = False)
In [15]:
dfchck
Out[15]:
date onr aonia sofia_vwap_asx sofia_median_asx sofia_vwap_rba sofia_vwap_0000 sofia_vwap_0040 sofia_vwap_2525 sofia_vwap_2525_2mn sofia_vwap_0525 sofia_vwap_0525_2mn sofia_vwap_relparty sofia_median_rba sofia_median_0000 sofia_median_0040 sofia_median_2525 sofia_median_2525_2mn sofia_median_0525 sofia_median_0525_2mn sofia_median_relparty sofia_vwap_rbachck sofia_median_rbachck sofia_vwap_relparty_check sofia_median_relparty_check
143 2022-07-29 1.35 1.31 1.290131 1.28 - - - - - - - - - - - - - - - - - - - -
186 2022-09-30 2.35 2.31 2.286827 2.28 - - - - - - - - - - - - - - - - - - - -
396 2023-08-04 4.10 4.07 4.087776 4.10 - - - - - - - - - - - - - - - - - - - -
397 2023-08-07 4.10 4.07 4.070000 4.07 4.07 4.07 4.07 - - - - - 4.07 4.07 4.07 - - - - - 4.07 4.07 - -
436 2023-09-29 4.10 4.07 4.070000 4.07 4.07 4.07 4.07 - - - - - 4.07 4.07 4.07 - - - - - 4.07 4.07 - -
437 2023-10-02 4.10 4.07 4.070000 4.07 4.07 4.07 4.07 - - - - - 4.07 4.07 4.07 - - - - - 4.07 4.07 - -
596 2024-05-21 4.35 4.32 4.337480 4.33 4.33748 4.323801 4.33935 - - 4.332223 4.332224 4.337763 4.33 4.33 4.33 - - 4.33 4.33 4.33 4.33748 4.33 4.33 4.337763
648 2024-08-02 4.35 4.34 4.316768 4.34 4.316768 4.300076 4.33346 4.25 4.25 4.25 4.25 - 4.34 4.34 4.34 4.25 4.25 4.25 4.25 - 4.316768 4.34 - -
649 2024-08-05 4.35 4.34 4.262413 4.25 4.262413 4.25931 4.265516 - - 4.255882 4.255882 4.25 4.25 4.25 4.25 - - 4.25 4.25 4.25 4.262413 4.25 4.25 4.25
671 2024-09-04 4.35 4.34 4.384976 4.38 4.384976 4.373672 4.38622 - - 4.381431 4.381432 4.385763 4.38 4.38 4.38 - - 4.38 4.38 4.38 4.384976 4.38 4.38 4.385763
693 2024-10-04 4.35 4.34 4.383924 4.38 - - - - - - - - - - - - - - - - - - - -
699 2024-10-15 4.35 4.34 4.386033 4.38 4.386033 4.375134 4.387541 - - 4.380863 4.380865 4.386242 4.38 4.38 4.38 - - 4.38 4.38 4.38 4.386033 4.38 4.38 4.386242
In [16]:
# Decide whether to round results
"""
roundflag == 1 : Do round
roundsig : How many sig figures
"""
roundflag, roundsig = 1, 4 
In [17]:
# Decide how to deal with missing results for the relevant rates
"""
'ffill' = forward fill from last valid, 
'asx' = take value from ASX SOFIA that day
'rba' = take value from RBA SOFIA that day
'ref1' = take value from whichever rate is the base rate (helps with not having to change code in two places)
'drop' = remove any date where any value is missing
'dropallbutrelparty' = remove any date where any value is missing except for in related party 
'ignore' (or any other value) , leave NA and KF sorts it out
"""

fillna = 'rba'

colsestimate = [ref1[:]]
colsestimate.extend(sofiaalt)

if fillna == 'ffill':
    dffin.loc[:,colsestimate] = dffin[colsestimate].fillna(method = 'ffill')
    
elif fillna == 'asx':
    for col in colsestimate:
        xmsng = dffin[col].isnull()
        dffin.loc[xmsng,col] = dffin.loc[xmsng,'sofia_vwap_asx']

elif fillna == 'ref1':
    for col in colsestimate:
        xmsng = dffin[col].isnull()
        dffin.loc[xmsng,col] = dffin.loc[xmsng,ref1]

elif fillna == 'rba':
    for col in colsestimate:
        xmsng = dffin[col].isnull()
        dffin.loc[xmsng,col] = dffin.loc[xmsng,'sofia_vwap_rba']
    
elif fillna == 'drop':
    xdrop = dffin[colsestimate[:-1]].isnull().sum(axis=1)>=1
    dffin = dffin[xdrop == False].reset_index(drop = True)
In [18]:
# Show final dataframe sum stats 
"""
Note we are missing four values for RBA SOFIA
"""
display(dffin[colsestimate].describe().T)
count mean std min 25% 50% 75% max
sofia_vwap_rba 803.0 3.231961 1.516501 0.006492 2.545580 4.092760 4.339666 4.404321
sofia_vwap_0525 803.0 3.224807 1.516512 -0.002261 2.537859 4.086545 4.334498 4.393798
sofia_vwap_0525_2mn 803.0 3.224827 1.516502 -0.001895 2.538861 4.086547 4.334501 4.393807
sofia_vwap_2525 803.0 3.216895 1.516103 -0.005351 2.525071 4.074461 4.328784 4.386033
sofia_vwap_2525_2mn 803.0 3.216937 1.516112 -0.004947 2.525094 4.074464 4.328801 4.386033
sofia_vwap_relparty 803.0 3.232672 1.517674 0.001329 2.545864 4.094269 4.340810 4.406417
In [19]:
dffin[['date','onr','sofia_vwap_asx',
       'sofia_vwap_rba',
       'sofia_vwap_0525',
       'sofia_vwap_2525',
       'sofia_vwap_relparty']].tail(10)
Out[19]:
date onr sofia_vwap_asx sofia_vwap_rba sofia_vwap_0525 sofia_vwap_2525 sofia_vwap_relparty
797 2025-03-06 4.1 4.138908 4.138656 4.131723 4.128115 4.140496
798 2025-03-07 4.1 4.137586 4.137586 4.132763 4.129531 4.138215
799 2025-03-10 4.1 4.133575 4.133575 4.127781 4.127781 4.134387
800 2025-03-11 4.1 4.133549 4.133549 4.128946 4.128946 4.134449
801 2025-03-12 4.1 4.134301 4.134301 4.129814 4.129814 4.134780
802 2025-03-13 4.1 4.134282 4.134282 4.129060 4.129060 4.135223
803 2025-03-14 4.1 4.135620 4.135595 4.130000 4.130000 4.135913
804 2025-03-17 4.1 4.136397 4.136411 4.130000 4.130000 4.136716
805 2025-03-18 4.1 4.135447 4.135447 4.130000 4.130000 4.135597
806 2025-03-19 4.1 4.134332 4.134332 4.129619 4.129619 4.134758
In [20]:
dffin[['date','onr','sofia_vwap_asx','sofia_vwap_rba','sofia_vwap_0525','sofia_vwap_relparty']].isnull().sum()
Out[20]:
date                   0
onr                    0
sofia_vwap_asx         0
sofia_vwap_rba         4
sofia_vwap_0525        4
sofia_vwap_relparty    4
dtype: int64
In [21]:
for ref2 in sofiaalt:   
    
    # Plot the data less cash rate target
    fig = plt.figure(figsize = (7, 3))
    ax1 = fig.add_subplot(1,1,1)
    
    if roundflag == 0:
        l1 = ax1.plot(dffin.date, 
                      dffin[ref1] - dffin['onr'], 'b-',
                      dffin.date,
                      dffin[ref2] - dffin['onr'], 'r--',
                      lw= 1, markersize = 3)       
    elif roundflag == 1:
        l1 = ax1.plot(dffin.date, 
                      dffin[ref1] - dffin['onr'], 'b-',
                      dffin.date,
                      np.round(dffin[ref2],2) - dffin['onr'], 'r--',
                      lw= 1, markersize = 3)       
        
    yearsFmt = DateFormatter('%b-%Y')
    years = YearLocator()
    months = MonthLocator()
    ax1.autoscale_view()
    ax1.xaxis.set_tick_params(labelbottom=True,
                             rotation = 90)
    ax1.xaxis.set_major_formatter(yearsFmt)
    ax1.xaxis.set_minor_locator(months)
    ax1.grid(True, linestyle = 'dashed')
    ax1.set_ylabel('Yield (%)')
    
    aonia_leg= mlines.Line2D([], [], color='blue', lw = 1,
                             linestyle = '-', label = dictlabels[ref1])
    sofia_leg = mlines.Line2D([], [], color='red', lw = 1,
                              linestyle = '--', label = dictlabels[ref2])
    
    plt.legend(handles=[aonia_leg, sofia_leg])
In [22]:
# Create data for MP state-space estimation
"""
For each alternative reference rate, create a Tx2 vector of simultaneously observed rates and estimate the 
main state-space model (with simultaneous publication/observation) to estimate noise variances
"""
dictdata, dictcntrl, dictdf = {}, {}, {}
estlistall = []

T = len(dffin)

for ref2 in sofiaalt:
    y = np.empty((T,2))

    if roundflag == 0:
        y[:,0] = dffin[ref1]
        y[:,1] = dffin[ref2]
    elif roundflag == 1:
        y[:,0] = np.round(dffin[ref1], decimals = 2)
        y[:,1] = np.round(dffin[ref2], decimals = 2)
        
    
    x = dffin[['onr']].values.reshape((T,1))
    
    dictdata[ref2] = y
    dictcntrl[ref2] = x
    dictdf[ref2] = dffin.copy()

    x = dictcntrl[ref2]
    xonr = x[:,-1]
    onrvar = x[:,-1].var()
    y = dictdata[ref2]
    datesccy = dictdf[ref2].date

    estlistall.append(['full', ref1, ref2, y, xonr, onrvar, datesccy])
In [23]:
# Estimate each model in parallel using all but four cores 
"""
stsp_mods contains code to estimate the state space model as per our 
noiseshares computes noise shares 
"""
from stsp_mods import stspestmp_contemp
from noiseshares import noiseshares_contemp

mp_cores = mp.cpu_count() - 4
pool = mp.Pool(mp_cores)
if __name__ ==  '__main__': 
    ssbyperccy = pool.map(stspestmp_contemp, estlistall)
pool.close()    
dfreginf = pd.concat(ssbyperccy) 
In [24]:
# Compute and print noise shares for each alternative
print(' ')
print('Noise shares (contemporaneous model):')
print('Ref 1 = ' + ref1 + ', Ref 2 = Alternative')
print(' ')
dfis = noiseshares_contemp(dfreginf, sofiaalt)
display(dfis.T)
 
Noise shares (contemporaneous model):
Ref 1 = sofia_vwap_rba, Ref 2 = Alternative
 
Ref 1 NS Ref 2 NS
sofia_vwap_0525 0.5770 0.4230
sofia_vwap_0525_2mn 0.5729 0.4271
sofia_vwap_2525 0.5181 0.4819
sofia_vwap_2525_2mn 0.5170 0.4830
sofia_vwap_relparty 0.5523 0.4477
In [25]:
print('Wald test p-value for equal noise variances:')
for ref2 in sofiaalt:
    waldtest_n = dfreginf[dfreginf.alt == ref2].iloc[0]['wald_test']
    print(f'Ref 1 = {ref1}, Ref 2 = {ref2}; pval = {np.round(waldtest_n.pvalue,4)}')
#     display(np.round(waldtest_n.pvalue,4))
Wald test p-value for equal noise variances:
Ref 1 = sofia_vwap_rba, Ref 2 = sofia_vwap_0525; pval = 0.0
Ref 1 = sofia_vwap_rba, Ref 2 = sofia_vwap_0525_2mn; pval = 0.0
Ref 1 = sofia_vwap_rba, Ref 2 = sofia_vwap_2525; pval = 0.0977
Ref 1 = sofia_vwap_rba, Ref 2 = sofia_vwap_2525_2mn; pval = 0.1213
Ref 1 = sofia_vwap_rba, Ref 2 = sofia_vwap_relparty; pval = 0.0031
In [26]:
print('Parameter estimates (contemporaneous model):')
for ref2 in sofiaalt:
    print(' ')
    print('Ref 1 = ' + ref1 + ', Ref 2 = ' + ref2)
    print(' ')
    display(dfreginf[dfreginf.alt == ref2].iloc[0]['ressum'])
Parameter estimates (contemporaneous model):
 
Ref 1 = sofia_vwap_rba, Ref 2 = sofia_vwap_0525
 
Statespace Model Results
Dep. Variable: ['y1', 'y2'] No. Observations: 807
Model: StSpMod_wCntrls Log Likelihood 5368.776
Date: Tue, 17 Feb 2026 AIC -10723.553
Time: 14:28:49 BIC -10690.699
Sample: 0 HQIC -10710.937
- 807
Covariance Type: opg
coef std err z P>|z| [0.025 0.975]
state_var 5.557e-05 1.32e-06 42.117 0.000 5.3e-05 5.82e-05
obs_var1 3.642e-05 7.71e-07 47.258 0.000 3.49e-05 3.79e-05
obs_var2 2.67e-05 9.12e-07 29.290 0.000 2.49e-05 2.85e-05
ref1ref2_spread 0.0072 0.001 8.281 0.000 0.006 0.009
betavar0_1 0.9936 0.006 157.109 0.000 0.981 1.006
betavar0_2 0.9936 0.006 157.998 0.000 0.981 1.006
Ljung-Box (L1) (Q): 11.19, 15.02 Jarque-Bera (JB): 48446.81, 70657.33
Prob(Q): 0.00, 0.00 Prob(JB): 0.00, 0.00
Heteroskedasticity (H): 1.05, 0.96 Skew: 2.61, -3.85
Prob(H) (two-sided): 0.66, 0.71 Kurtosis: 40.62, 48.22


Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
 
Ref 1 = sofia_vwap_rba, Ref 2 = sofia_vwap_0525_2mn
 
Statespace Model Results
Dep. Variable: ['y1', 'y2'] No. Observations: 807
Model: StSpMod_wCntrls Log Likelihood 5366.335
Date: Tue, 17 Feb 2026 AIC -10718.669
Time: 14:28:49 BIC -10685.816
Sample: 0 HQIC -10706.054
- 807
Covariance Type: opg
coef std err z P>|z| [0.025 0.975]
state_var 5.603e-05 1.32e-06 42.368 0.000 5.34e-05 5.86e-05
obs_var1 3.615e-05 7.73e-07 46.796 0.000 3.46e-05 3.77e-05
obs_var2 2.695e-05 9.22e-07 29.234 0.000 2.51e-05 2.88e-05
ref1ref2_spread 0.0071 0.001 8.184 0.000 0.005 0.009
betavar0_1 0.9936 0.006 156.265 0.000 0.981 1.006
betavar0_2 0.9935 0.006 157.178 0.000 0.981 1.006
Ljung-Box (L1) (Q): 12.31, 16.16 Jarque-Bera (JB): 48623.03, 69350.92
Prob(Q): 0.00, 0.00 Prob(JB): 0.00, 0.00
Heteroskedasticity (H): 1.05, 0.94 Skew: 2.61, -3.84
Prob(H) (two-sided): 0.67, 0.64 Kurtosis: 40.69, 47.79


Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
 
Ref 1 = sofia_vwap_rba, Ref 2 = sofia_vwap_2525
 
Statespace Model Results
Dep. Variable: ['y1', 'y2'] No. Observations: 807
Model: StSpMod_wCntrls Log Likelihood 5170.690
Date: Tue, 17 Feb 2026 AIC -10327.381
Time: 14:28:49 BIC -10294.528
Sample: 0 HQIC -10314.765
- 807
Covariance Type: opg
coef std err z P>|z| [0.025 0.975]
state_var 3.715e-05 1.27e-06 29.252 0.000 3.47e-05 3.96e-05
obs_var1 5.493e-05 1.34e-06 41.045 0.000 5.23e-05 5.76e-05
obs_var2 5.11e-05 1.52e-06 33.696 0.000 4.81e-05 5.41e-05
ref1ref2_spread 0.0142 0.001 10.534 0.000 0.012 0.017
betavar0_1 0.9971 0.006 179.487 0.000 0.986 1.008
betavar0_2 0.9968 0.005 181.564 0.000 0.986 1.008
Ljung-Box (L1) (Q): 0.60, 79.04 Jarque-Bera (JB): 24038.54, 11071.29
Prob(Q): 0.44, 0.00 Prob(JB): 0.00, 0.00
Heteroskedasticity (H): 1.02, 1.32 Skew: 1.81, -2.22
Prob(H) (two-sided): 0.85, 0.02 Kurtosis: 29.51, 20.60


Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
 
Ref 1 = sofia_vwap_rba, Ref 2 = sofia_vwap_2525_2mn
 
Statespace Model Results
Dep. Variable: ['y1', 'y2'] No. Observations: 807
Model: StSpMod_wCntrls Log Likelihood 5175.599
Date: Tue, 17 Feb 2026 AIC -10337.199
Time: 14:28:49 BIC -10304.346
Sample: 0 HQIC -10324.583
- 807
Covariance Type: opg
coef std err z P>|z| [0.025 0.975]
state_var 3.784e-05 1.29e-06 29.278 0.000 3.53e-05 4.04e-05
obs_var1 5.402e-05 1.32e-06 41.028 0.000 5.14e-05 5.66e-05
obs_var2 5.046e-05 1.52e-06 33.252 0.000 4.75e-05 5.34e-05
ref1ref2_spread 0.0141 0.001 10.531 0.000 0.011 0.017
betavar0_1 0.9970 0.006 179.294 0.000 0.986 1.008
betavar0_2 0.9967 0.005 181.359 0.000 0.986 1.008
Ljung-Box (L1) (Q): 0.37, 75.23 Jarque-Bera (JB): 24404.32, 11578.30
Prob(Q): 0.54, 0.00 Prob(JB): 0.00, 0.00
Heteroskedasticity (H): 1.02, 1.31 Skew: 1.83, -2.27
Prob(H) (two-sided): 0.85, 0.03 Kurtosis: 29.71, 21.01


Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
 
Ref 1 = sofia_vwap_rba, Ref 2 = sofia_vwap_relparty
 
Statespace Model Results
Dep. Variable: ['y1', 'y2'] No. Observations: 807
Model: StSpMod_wCntrls Log Likelihood 5425.334
Date: Tue, 17 Feb 2026 AIC -10836.668
Time: 14:28:49 BIC -10803.815
Sample: 0 HQIC -10824.052
- 807
Covariance Type: opg
coef std err z P>|z| [0.025 0.975]
state_var 7.084e-05 1.31e-06 54.251 0.000 6.83e-05 7.34e-05
obs_var1 2.706e-05 8.08e-07 33.483 0.000 2.55e-05 2.86e-05
obs_var2 2.193e-05 9.86e-07 22.238 0.000 2e-05 2.39e-05
ref1ref2_spread 0.0014 0.001 1.380 0.168 -0.001 0.003
betavar0_1 0.9964 0.007 141.514 0.000 0.983 1.010
betavar0_2 0.9972 0.007 141.438 0.000 0.983 1.011
Ljung-Box (L1) (Q): 33.19, 8.55 Jarque-Bera (JB): 56215.75, 333307.94
Prob(Q): 0.00, 0.00 Prob(JB): 0.00, 0.00
Heteroskedasticity (H): 1.11, 0.42 Skew: 2.32, -5.73
Prob(H) (two-sided): 0.39, 0.00 Kurtosis: 43.65, 101.96


Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
In [ ]: